Language Independent Morphological Analysis

Authors

  • Tatsuo Yamashita
  • Yuji Matsumoto
Abstract

This paper proposes a framework for language-independent morphological analysis and concentrates mainly on tokenization, the first step of morphological analysis. Although tokenization is usually not regarded as a difficult task in segmented languages such as English, a number of problems arise in treating lexical entries precisely. We first introduce the concept of morpho-fragments, which are intermediate units between characters and lexical entries. We then describe our approach to resolving the problems that arise in tokenization so as to obtain a language-independent morphological analyzer.
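As a rough illustration of units that sit between characters and lexical entries, the sketch below first cuts text into small fragments and then lets lexicon entries span several of them. The abstract does not define morpho-fragments precisely, so the fragment rule (`FRAGMENT_RE`), the greedy longest-match strategy, the `max_span` limit, and the function names are all assumptions made for this example, not the authors' method.

```python
import re

# Hypothetical fragment rule (an assumption, not taken from the paper):
# a run of letters/digits, or one single punctuation/symbol character.
FRAGMENT_RE = re.compile(r"[^\W_]+|[^\w\s]|_")


def fragments(text):
    """Split text into (string, start, end) fragments, skipping whitespace."""
    return [(m.group(), m.start(), m.end()) for m in FRAGMENT_RE.finditer(text)]


def tokenize(text, lexicon, max_span=4):
    """Greedy longest match of lexicon entries over fragment boundaries.

    A token is either a lexicon entry covering up to max_span consecutive
    fragments (including any whitespace between them) or a single fragment.
    """
    frags = fragments(text)
    tokens = []
    i = 0
    while i < len(frags):
        # Try the longest candidate span first, then shrink it.
        for j in range(min(len(frags), i + max_span), i, -1):
            surface = text[frags[i][1]:frags[j - 1][2]]
            # A single fragment is always accepted as a fallback token.
            if j - i == 1 or surface in lexicon:
                tokens.append(surface)
                i = j
                break
    return tokens


if __name__ == "__main__":
    lexicon = {"don't", "New York"}
    print(tokenize("I don't live in New York.", lexicon))
```

With this toy lexicon, "don't" (three fragments) and "New York" (two fragments separated by a space) each come out as one token, while the final period stays separate. Greedy matching does not resolve ambiguity between competing segmentations; a full analyzer would need a disambiguation step on top of this.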


Similar resources

Morphological analysis for less-resourced languages: Maximum Affix Overlap applied to Zulu

The paper describes a collaboration approach in progress for morphological analysis of less-resourced languages. The approach is based on firstly, a language-independent machine learning algorithm, Maximum Affix Overlap, that generates candidates for morphological decompositions from an initial set of language-specific training data; and secondly, language-dependent post-processing using langua...


Remote Elicitation of Inflectional Paradigms to Seed Morphological Analysis in Low-Resource Languages

Structured, complete inflectional paradigm data exists for very few of the world’s languages, but is crucial to training morphological analysis tools. We present methods inspired by linguistic fieldwork for gathering inflectional paradigm data in a machine-readable, interoperable format from remotely-located speakers of any language. Informants are tasked with completing language-specific parad...


Morphological Analysis as Classification: an Inductive-Learning Approach

Morphological analysis is an important subtask in text-to-speech conversion, hyphenation, and other language engineering tasks. The traditional approach to performing morphological analysis is to combine a morpheme lexicon, sets of (linguistic) rules, and heuristics to find a most probable analysis. In contrast we present an inductive learning approach in which morphological analysis is reformu...


Joint Voice Harmony Restoration and Morphological Segmentation for Uyghur Morphological Analysis

Morphological analysis is a fundamental task of Uyghur language information processing. To solve the problem of error propagation in traditional morphological analysis methods, which carry out voice harmony restoration first and then morphological segmentation, this paper presents a joint method combining voice harmony restoration and morphological segmentation. This...


Character-Aware Neural Morphological Disambiguation

We develop a language-independent, deep learning-based approach to the task of morphological disambiguation. Guided by the intuition that the correct analysis should be “most similar” to the context, we propose dense representations for morphological analyses and surface context and a simple yet effective way of combining the two to perform disambiguation. Our approach improves on the languaged...




Publication date: 2000